A common approach to designing parallel languages is to provide high-level handles for controlling how the parallel platform is used. These handles expose aspects of the target platform, for example, shared vs. distributed memory or task parallelism vs. data parallelism. Depending on the circumstances, they may be too powerful, not powerful enough, or impose unnecessary constraints. Many potential programmers avoid these languages because they require too much understanding of parallelism. Many applications written this way fall well short of their potential performance. Applications written in one such language often need to be rewritten for a parallel platform that is sufficiently different.
Our goal is to significantly widen the set of programmers who can program for parallel targets by providing an approach that does not require any understanding of parallelism, while maximizing both the flexibility for optimization and the potential for reuse of the same program.
Instead of viewing the language design problem as one of providing the programmer with high-level handles, we view it as one of designing an interface. On one side of this interface is the programmer (domain expert), who knows the application but needs no knowledge of any aspect of the platform. On the other side is the performance expert (programmer or program), who demands maximal flexibility for optimizing the mapping to a wide range of target platforms (parallel/serial, shared/distributed, homogeneous/heterogeneous, etc.) but needs no knowledge of the domain.
Separating the description of the algorithm from the description of the mapping has several benefits. It allows the description of the algorithm to remain stable across a much wider range of platforms. It allows the algorithm to be developed by a domain expert who has no understanding of, or interest in, parallelism. It allows tuning by a performance expert who has little or no understanding of the domain. It facilitates independent development of the algorithm and the tuning even when both are performed by the same individual.
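The separation described above can be illustrated with a minimal sketch. It is not the language or runtime proposed here; the names `SPEC`, `run_serial`, and `run_threaded` are hypothetical and chosen only for illustration. The domain expert's side is a table of pure computations with their data dependencies, containing no mention of threads or memory. The performance expert's side is a pair of interchangeable mappings, one serial and one using a thread pool, that execute the same description unchanged.

```python
from concurrent.futures import ThreadPoolExecutor

# Domain expert's side (hypothetical format): each output name maps to
# a pure function and the names of the values it consumes.
SPEC = {
    "squared": (lambda x: x * x, ["x"]),
    "doubled": (lambda x: 2 * x, ["x"]),
    "total": (lambda a, b: a + b, ["squared", "doubled"]),
}


def _ready(pending, values):
    """Names of pending computations whose inputs are all available."""
    ready = [n for n, (_, deps) in pending.items()
             if all(d in values for d in deps)]
    if not ready:
        raise ValueError("unsatisfiable dependencies")
    return ready


# Performance expert's side, mapping 1: run everything on one thread.
def run_serial(spec, inputs):
    values, pending = dict(inputs), dict(spec)
    while pending:
        for name in _ready(pending, values):
            fn, deps = pending.pop(name)
            values[name] = fn(*(values[d] for d in deps))
    return values


# Performance expert's side, mapping 2: run each wave of independent
# computations concurrently on a thread pool.
def run_threaded(spec, inputs, workers=4):
    values, pending = dict(inputs), dict(spec)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            futures = {}
            for name in _ready(pending, values):
                fn, deps = pending.pop(name)
                futures[name] = pool.submit(fn, *(values[d] for d in deps))
            for name, future in futures.items():
                values[name] = future.result()
    return values


if __name__ == "__main__":
    # Both mappings execute the same, unmodified algorithm description.
    print(run_serial(SPEC, {"x": 3})["total"])
    print(run_threaded(SPEC, {"x": 3})["total"])
```

Because `SPEC` states only what depends on what, either mapping can be swapped in, or a new one written for a different platform, without touching the algorithm description.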